Skip to content

Vectorize pvlib.bifacial.utils._vf_ground_sky_2d across surface_tilt #1682

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Mar 18, 2023

Conversation

kandersolar
Copy link
Member

  • Closes Infinite sheds perf improvement: vectorize over surface_tilt #1680
  • I am familiar with the contributing guidelines
  • [ ] Tests added
  • [ ] Updates entries in docs/sphinx/source/reference for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • [ ] New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels (including remote-data) and Milestone are assigned to the Pull Request and linked Issue.

I could not get satisfactory benchmarking results using asv (perhaps this computer is too noisy for asv's defaults?) so I wrote a one-off benchmarking script that yielded the following runtime comparison for infinite_sheds.get_irradiance() with inputs for a generic single-axis tracking simulation:

# timestamps 7bb30ad [s] PR [s] ratio
1 0.021 0.022 1.011
10 0.025 0.023 0.926
100 0.046 0.042 0.905
1000 0.307 0.23 0.748
10000 2.874 2.219 0.772

I don't deal with multidimensional numpy arrays very often. I will happily make changes if what I've done here is unnecessarily clunky or inefficient and someone suggests improvements.

@kandersolar kandersolar closed this Mar 3, 2023
@kandersolar kandersolar reopened this Mar 3, 2023
@kandersolar kandersolar mentioned this pull request Mar 3, 2023
12 tasks
@kandersolar
Copy link
Member Author

It occurred to me that this PR might have objectionably large memory requirements (similar to #1402) so I did some memory profiling and found that a call to get_irradiance() with inputs of length 10000 increased peak memory usage from ~50 MB on main to ~900 MB on commit 0d80d7d. I was able to get that peak down to ~470 MB with some strategic usage of del in 9bf9360:

image

The memory load is still larger than I'd like, but with that reduction I'd say it's an acceptable price to pay for the runtime improvement. I'm curious what others think.

@wholmgren
Copy link
Member

1 year of 1 minute data is about half a million points. If memory usage scales linearly with that dimension, then I think some users will run into a problem.

@kandersolar
Copy link
Member Author

Good point. Well, taking full advantage of numpy's out parameter (and unfortunately making the code substantially less readable) gets the calculation down to 207 MB peak usage for the 10000 points case. For 525600 points, peak usage is 11 GB. I think there is not much room for additional improvement in _vf_ground_sky_2d beyond what is implemented in 6f490ea. Maybe it's worth adding a vectorize=True kwarg that the user can set to False if they want to use the old loopy approach to prevent memory blowup?

Updated plot:

image

@cwhanse
Copy link
Member

cwhanse commented Mar 8, 2023

If we add a vectorize kwarg I'd default to False.

@wholmgren
Copy link
Member

I'm not a numpy expert, especially in regards to performance, but this looks reasonable to me. Perhaps a numba element by element calculation would combine readability, performance, and reasonable memory usage. Perhaps.

@kandersolar
Copy link
Member Author

Ok, I added a new vectorize=False kwarg to infinite sheds. I think I'm satisfied with this PR now.

Here's an asv runtime comparison:

$ asv run --python=same -e -b InfiniteSheds
· Discovering benchmarks
· Running 4 total benchmarks (1 commits * 1 environments * 4 benchmarks)
[  0.00%] ·· Benchmarking existing-pyc__users_kanderso_software_anaconda3_envs_pvlib-dev_python.exe
[ 12.50%] ··· Running (infinite_sheds.InfiniteSheds.time_get_irradiance_fixed--)....
[ 62.50%] ··· infinite_sheds.InfiniteSheds.time_get_irradiance_fixed                                                 ok
[ 62.50%] ··· =========== ============
               vectorize
              ----------- ------------
                  True      79.4±1ms
                 False     78.2±0.4ms
              =========== ============

[ 75.00%] ··· infinite_sheds.InfiniteSheds.time_get_irradiance_poa_fixed                                             ok
[ 75.00%] ··· =========== ============
               vectorize
              ----------- ------------
                  True     39.1±0.5ms
                 False     38.5±0.7ms
              =========== ============

[ 87.50%] ··· infinite_sheds.InfiniteSheds.time_get_irradiance_poa_tracking                                          ok
[ 87.50%] ··· =========== =========
               vectorize
              ----------- ---------
                  True     174±2ms
                 False     272±5ms
              =========== =========

[100.00%] ··· infinite_sheds.InfiniteSheds.time_get_irradiance_tracking                                              ok
[100.00%] ··· =========== ==========
               vectorize
              ----------- ----------
                  True     365±20ms
                 False     552±7ms
              =========== ==========

@kandersolar
Copy link
Member Author

This PR is ready for another look if @cwhanse or @wholmgren (or anyone else, of course) is interested. The new vectorized calculation path is now gated behind an optional vectorize parameter with the original low-memory serial calculation path still the default.

Copy link
Member

@cwhanse cwhanse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kanderso-nrel this PR changes the output (shape of vf) of bifacial.utils._solar_projection_tangent. That's a private function, so I suppose a deprecation is optional. I'm OK without deprecation, but raising the issue in case there are differing opinions.

@kandersolar kandersolar merged commit c1856ac into pvlib:main Mar 18, 2023
@kandersolar kandersolar deleted the vectorize_vf_ground_sk_2d branch March 18, 2023 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infinite sheds perf improvement: vectorize over surface_tilt
3 participants